Towards Extracting Domain Knowledge from C Code
Authors
Abstract
“Writing code is not the problem, understanding the code is the problem.” This saying [7] summarizes how important domain knowledge is in software development and maintenance. Gathering this knowledge is expensive and demanding: it requires time, money, and other resources, because the knowledge is scattered across many locations in the source code. In this work, we propose a method to recover domain knowledge from C code using identifiers. To this end, we extract identifiers from source code, use them to generate domain concepts, and investigate the interrelations among those concepts. We describe four use cases in which the domain concepts and their interrelations are typically used. To evaluate the performance of our approach, we conduct two experiments. Both experiments show promising results: our approach misses only a few relevant concepts and rarely generates irrelevant ones. Currently, our approach is not fully automated, because the user has to go through a short list that contains both domain concepts and general concepts and manually remove the general ones.
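The abstract does not spell out how identifiers are extracted or turned into concepts, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a regex-based tokenizer instead of a real C parser, assumes that identifiers are split on underscores and camelCase boundaries, and assumes that term frequency is a reasonable first ranking of candidate concepts. All function names (`extract_identifiers`, `split_identifier`, `candidate_concepts`) are hypothetical.

```python
import re
from collections import Counter

# C keywords (C99) that should not be treated as identifiers.
C_KEYWORDS = {
    "auto", "break", "case", "char", "const", "continue", "default", "do",
    "double", "else", "enum", "extern", "float", "for", "goto", "if",
    "inline", "int", "long", "register", "restrict", "return", "short",
    "signed", "sizeof", "static", "struct", "switch", "typedef", "union",
    "unsigned", "void", "volatile", "while",
}

# Comments and literals carry no identifiers, so they are blanked out first.
_NOISE_RE = re.compile(
    r"/\*.*?\*/"            # block comments
    r"|//[^\n]*"            # line comments
    r'|"(?:\\.|[^"\\])*"'   # string literals
    r"|'(?:\\.|[^'\\])*'",  # character literals
    re.DOTALL,
)
# Simplification (assumption): drop preprocessor lines entirely.
# Macros continued with a trailing backslash are not handled.
_PREPROCESSOR_RE = re.compile(r"^[ \t]*#[^\n]*", re.MULTILINE)
_IDENT_RE = re.compile(r"\b[A-Za-z_][A-Za-z0-9_]*\b")


def extract_identifiers(code):
    """Return all non-keyword identifiers, in order of appearance."""
    cleaned = _NOISE_RE.sub(" ", code)
    cleaned = _PREPROCESSOR_RE.sub(" ", cleaned)
    return [t for t in _IDENT_RE.findall(cleaned) if t not in C_KEYWORDS]


def split_identifier(identifier):
    """Split snake_case and camelCase identifiers into lowercase terms."""
    terms = []
    for part in identifier.split("_"):
        terms.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+", part))
    return [t.lower() for t in terms]


def candidate_concepts(code):
    """Count how often each term occurs across all identifiers."""
    counts = Counter()
    for ident in extract_identifiers(code):
        counts.update(t for t in split_identifier(ident) if not t.isdigit())
    return counts
```

Calling `candidate_concepts(source).most_common(20)` on the text of a C file gives a ranked list of candidate terms. Such a list still mixes domain terms with general ones (for example `get` or `printf`), which matches the manual filtering step the abstract mentions.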
Similar resources
Presenting a method for extracting structured domain-dependent information from Farsi Web pages
Extracting structured information about entities from web texts is an important task in web mining, natural language processing, and information extraction. Information extraction is useful in many applications including search engines, question-answering systems, recommender systems, machine translation, etc. An information extraction system aims to identify the entities from the text and extr...
Intentional meaning of programs
Software engineering is a quest for appropriate modeling and abstraction. Writing programs that simulate parts of the real world requires programmers to fill the conceptual gap between the domain knowledge and computer languages. As a consequence of the conceptual distance between the business domain and the general purpose programming languages, clearly identifiable concepts at the domain leve...
Presenting a model for extracting information from textual documents, based on text mining, in the e-learning domain
As computer networks become the backbone of science and the economy, enormous quantities of documents become available. Text mining techniques have therefore been used to extract useful information from textual data. Text mining has become an important research area that discovers unknown information, facts, or new hypotheses by automatically extracting information from different written documents. T...
A Domain Independent Approach for Extracting Terms from Research Papers
We study the problem of extracting terms from research papers, which is an important step towards building knowledge graphs in the research domain. Existing terminology extraction approaches are mostly domain dependent. They use domain-specific linguistic rules, supervised machine learning techniques, or a combination of the two to extract the terms. Using domain knowledge requires much human effort...
Query Architecture Expansion in Web Using Fuzzy Multi Domain Ontology
As the web grows, establishing a general framework for data mining and for retrieving structured data from the web poses many challenges. Creating an ontology is a step towards solving this problem. The ontology raises the main entity and the concept of any data in data mining. In this paper, we tried to propose a method for applying the "meaning" of the search system, but the problem ...
Publication date: 2014